KMC 3: counting and manipulating k-mer statistics
نویسندگان
چکیده
منابع مشابه
KMC 3: counting and manipulating k-mer statistics
Summary Counting all k -mers in a given dataset is a standard procedure in many bioinformatics applications. We introduce KMC3, a significant improvement of the former KMC2 algorithm together with KMC tools for manipulating k -mer databases. Usefulness of the tools is shown on a few real problems. Availability and implementation Program is freely available at http://sun.aei.polsl.pl/REFRESH/k...
متن کاملKMC 2: fast and resource-frugal k-mer counting
MOTIVATION Building the histogram of occurrences of every k-symbol long substring of nucleotide data is a standard step in many bioinformatics applications, known under the name of k-mer counting. Its applications include developing de Bruijn graph genome assemblers, fast multiple sequence alignment and repeat detection. The tremendous amounts of NGS data require fast algorithms for k-mer count...
متن کاملStatistics for K-mer Based Splicing Analysis
It is well acknowledged that alternative splicing module plays a crucial role to identify the variations of the RNA transcriptomes. In high-throughput short-read RNA, splicing analysis is a challenging task due to the uncertainty and time complexity of reads alignments onto genome and transcriptome. In this paper, we introduce k-mer based statistical method for splicing event analysis. The k-me...
متن کاملSqueakr: an exact and approximate k-mer counting system
Motivation k-mer-based algorithms have become increasingly popular in the processing of high-throughput sequencing data. These algorithms span the gamut of the analysis pipeline from k-mer counting (e.g. for estimating assembly parameters), to error correction, genome and transcriptome assembly, and even transcript quantification. Yet, these tasks often use very different k-mer representations ...
متن کاملMultiple comparative metagenomics using multiset k-mer counting
Background. Large scale metagenomic projects aim to extract biodiversity knowledge between different environmental conditions. Current methods for comparing microbial communities face important limitations. Those based on taxonomical or functional assignation rely on a small subset of the sequences that can be associated to known organisms. On the other hand, de novo methods, that compare the w...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Bioinformatics
سال: 2017
ISSN: 1367-4803,1460-2059
DOI: 10.1093/bioinformatics/btx304